Efficient Frequent Pattern Mining on Web Logs

Authors

  • Liping Sun
  • Xiuzhen Zhang
Abstract

Mining frequent patterns from Web logs is an important data mining task. Candidate-generation-and-test and pattern-growth are two representative approaches to frequent pattern mining. We have conducted extensive experiments on real-world Web log data to analyse the characteristics of Web logs and the behaviour of these two approaches on them. To improve the performance of current algorithms on Web logs, we propose a new algorithm, Combined Frequent Pattern Mining (CFPM), designed specifically for Web log data. CFPM uses heuristics to prune the search space and reduce mining costs, which yields better efficiency. Experimental results show that CFPM improves the performance of the pattern-growth approach by 1.2 to 7.8 times when mining frequent patterns from Web logs.
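
As an illustration of the candidate-generation-and-test approach named in the abstract, here is a minimal Apriori-style sketch in Python. It is not the CFPM algorithm, whose heuristics the abstract does not spell out. The function name `apriori`, the sample sessions, and the `min_support` threshold are all assumptions made for this example.

```python
from itertools import combinations

def apriori(sessions, min_support):
    """Return {frozenset(pages): support count} for every frequent page set.

    Illustrative only: each session is treated as an unordered set of pages,
    and support is an absolute count of sessions (both are assumptions).
    """
    transactions = [frozenset(s) for s in sessions]

    # Level 1: count individual pages.
    counts = {}
    for t in transactions:
        for page in t:
            key = frozenset([page])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Generate: join frequent (k-1)-sets into candidate k-sets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        # Test: scan the sessions to count each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result

# Hypothetical Web sessions: each is the list of pages one visitor requested.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/about"],
    ["/products", "/cart"],
]
patterns = apriori(sessions, min_support=2)
```

The repeated scan in the test step is the cost that pattern-growth methods such as FP-growth avoid by compressing the sessions into a tree and mining it without generating candidates.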

Similar resources

Efficient Discovery of Frequent Patterns using KFP-Tree from Web Logs

Frequent pattern discovery is a heavily studied area of data mining. Discovering concealed information in Web log data is called Web usage mining; it finds interesting and frequent user access patterns in Web logs. This paper presents a novel approach, based on k-means and the frequent pattern tree (FP-tree), for frequent pattern mining from Web log data.

Effective web log mining and online navigational pattern prediction

The web has become the world's largest repository of knowledge. Web usage mining is the process of discovering knowledge from the interactions generated by the user in the form of access logs, cookies, and user sessions data. Web Mining consists of three different categories, namely Web Content Mining, Web Structure Mining, and Web Usage Mining (is the process of discovering knowledge from the ...

Efficient Frequent Pattern Mining on Web Log Data

Mining frequent patterns from web log data can help to optimise the structure of a web site and improve the performance of web servers. Web users can also benefit from these frequent patterns. Much effort has been devoted to mining frequent patterns efficiently. Candidate-generation-and-test approach (Apriori and its variants) and pattern-growth approach (FP-growth and its variants) are the two re...

Web Usage Mining: users' navigational patterns extraction from web logs using ant-based clustering method

Web Usage Mining is the process of applying data mining techniques to discover usage patterns from data extracted from Web log files. It mines the secondary data (web logs) derived from users' interaction with web pages during a certain period of Web sessions. Web usage mining consists of three phases, namely preprocessing, pattern discovery, and pattern analysis. In this paper, w...

Mining Constraint-based Multidimensional Frequent Sequential Pattern in Web Logs

In this paper we introduce an efficient strategy for discovering ... Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Web usage mining consists of three phases, namely preprocessing, pattern discovery, and pattern analysis. This paper describes each of these phases in ...

Publication date: 2004